Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
10604 | 45 | 0,4% |
10788 | 44 | 2,4% |
11377 | 41 | 0,7% |
11573 | 40 | 1,5 |
11576 | 40 | 2,5 |
11770 | 39 | 0,1% |
11771 | 39 | 0,5% |
12029 | 38 | 2,5% |
12033 | 38 | 4,5 |
12471 | 36 | 1,6% |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
1973 | 341 | 50% |
2523 | 264 | 100% |
2543 | 262 | 30% |
2547 | 261 | 10% |
2768 | 239 | 20% |
3073 | 212 | 80% |
3211 | 202 | 40% |
3611 | 176 | 60% |
3770 | 167 | 15% |
3948 | 159 | 25% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
44982 | 5 | H&M |
50171 | 4 | AT&T |
52587 | 4 | S&D |
62143 | 3 | PG&E |
63033 | 3 | Ski&Bike |
73491 | 2 | Bee&Bee |
74885 | 2 | Dani&Flo |
77184 | 2 | Jey&em |
80324 | 2 | Pull&Bear |
80407 | 2 | R&B |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
104149 | 1 | A$AP |
104150 | 1 | A$AP Rocky |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
7943 | 66 | ." |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
69 | 7399 | s'ha |
105 | 4846 | d'un |
144 | 3592 | d'una |
167 | 3175 | s'han |
239 | 2283 | d'aquest |
243 | 2276 | l'Estat |
279 | 2110 | l'any |
382 | 1597 | l'1-O |
510 | 1258 | l'Ajuntament |
519 | 1246 | d'aquesta |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
21912 | 16 | R+D |
27968 | 11 | PS+Independents |
29827 | 10 | Terceravia+Independents |
33724 | 8 | Demòcrates+Independents |
37074 | 7 | PS+I |
39633 | 6 | 2+1 |
39830 | 6 | Apple TV+ |
45753 | 5 | R+D+I |
45974 | 5 | Sónar+D |
46083 | 5 | UP+DA |

Words containing *:

Rank in Wordlist | Frequency | Word |
---|---|---|
80973 | 2 | Sagitari A* |
115042 | 1 | F*cking Money Man |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
1011 | 670 | https://t |
3871 | 163 | km/h |
8478 | 61 | i/o |
10368 | 47 | l/m2 |
13009 | 34 | 2/4 |
16077 | 25 | 3/24 |
17202 | 23 | c/ |
18366 | 21 | d'https://t |
19767 | 19 | g/l |
20822 | 17 | 24/2015 |

In the last subsection of this type, we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have a very low frequency, this may indicate a problem in preprocessing.
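
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set of forbidden characters, the frequency threshold of 10, and the function name `suspicious_words` are all hypothetical and would need to be chosen per language and corpus. The apostrophe is left out of the forbidden set here because elided forms such as s'ha are regular words in this wordlist.

```python
from collections import defaultdict

# Assumption: characters treated as forbidden inside words for this language.
# The apostrophe is deliberately excluded (s'ha, d'un, l'any are valid words).
FORBIDDEN = set(',()%&$"+*=/_')

# Assumption: frequencies below this value count as "very low".
MIN_SUSPICIOUS_FREQ = 10


def suspicious_words(wordlist, forbidden=FORBIDDEN, min_freq=MIN_SUSPICIOUS_FREQ):
    """Group (word, freq) pairs with freq >= min_freq by forbidden character."""
    hits = defaultdict(list)
    for word, freq in wordlist:
        if freq < min_freq:
            continue  # very rare words are tolerated as noise
        for ch in sorted(set(word) & forbidden):
            hits[ch].append((word, freq))
    return dict(hits)


# A few entries taken from the tables in this section.
sample = [("50%", 341), ("km/h", 163), ("s'ha", 7399), ("H&M", 5), ("R+D", 16)]
report = suspicious_words(sample)
for ch, words in report.items():
    print(ch, words)
```

Whether a reported word such as km/h points to a real preprocessing problem or is an acceptable token still has to be judged by hand; the sketch only narrows down the candidates.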
Example query, for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
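
The same query can be repeated for every character in the list. The following is a minimal sketch, assuming a `words` table with the columns `w_id`, `freq` and `word` as in the query above. It uses Python's `sqlite3` module with an in-memory toy table as a stand-in for the real database, and `instr()` instead of `LIKE`, so that `%` and `_` need no escaping. The helper name `top_words_with` and the `ORDER BY w_id` clause (which assumes that `w_id` follows the rank order) are additions for illustration, not part of the original query.

```python
import sqlite3

# The special characters listed in this section.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def top_words_with(conn, char, limit=10):
    """Return (rank, freq, word) rows for words that contain `char`."""
    query = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND instr(word, ?) > 0 "
        "ORDER BY w_id LIMIT ?"  # assumption: w_id increases with rank
    )
    return conn.execute(query, (char, limit)).fetchall()


# Toy table: w_id is set to rank + 100, mirroring the offset in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [(169, 7399, "s'ha"), (2073, 341, "50%"), (2623, 264, "100%"), (3971, 163, "km/h")],
)

results = {ch: top_words_with(conn, ch) for ch in SPECIAL_CHARS}
for ch, rows in results.items():
    if rows:
        print(f"Words containing {ch}: {rows}")
```

Characters that return no rows, such as `=` or `_` in this toy table, are simply skipped in the output.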